ABSTRACT

In this work we propose a metric to assess academic
productivity based on publication outputs. We are interested
in knowing how well a research group in an area of knowledge
is doing relative to a pre-selected set of reference groups,
where each group is composed of academics or researchers.
To assess academic productivity we propose a new metric,
which we call P-score. P-score assigns weights to venues
using only the publication patterns of the selected reference
groups. This implies that P-score does not depend on
citation data and is thus simpler to compute, particularly
in contexts in which citation data is not easily available.
Also, preliminary experiments suggest that P-score
preserves a strong correlation with citation-based metrics.
1. INTRODUCTION

The assessment of academic productivity usually involves
associating metrics with the researchers or groups of
researchers one wants to evaluate. Funding agencies,
university officials, and department chairs are examples of
entities interested in these metrics, which have applications
in a variety of practical situations. There are also cases in
which one needs to compare researchers working in the same
sub-area of knowledge; examples include finding peer
reviewers, constructing program committees, and assembling
teams for grants.

Today, the most reliable and complete way to compare
researchers is by compiling information on their academic
output, such as number of publications, citation-based
metrics, number of undergraduate and graduate students under
supervision, number of advised master's and PhD theses,
and participation in conferences and in technical
committees. Some councils also use extensive surveys to
compile qualitative information on features associated with
the programs.


However, as compiling this information is not a simple
task and takes a long time, it is common practice to use
only citation data to gain quick insights into the
productivity of research groups and academics. But, given
that compiling citation counts requires access to the
contents of a large pool of publications, which is not always
available, new and complementary metrics, such as P-score,
are a necessity.

The notion of academic productivity is intrinsically
associated with the notion of reputation. Although the
concept of reputation lacks a precise definition, we can see
it as a simple property of an individual or group that
measures their academic impact in the world and with which
we can associate metrics. To measure the reputation of
researchers, it is common practice to use the venues they
publish in. The higher the impact of a venue, the higher the
reputation of the researchers who publish in it is considered
to be. We use this idea of transferring reputation through
publications to introduce a new metric called P-score.

2. THE P-SCORE APPROACH 

The question we address in this work is: how can we model
research groups, researchers and venues to capture the notion
of relevance or importance of each, using only information
about (i) the relationship of groups and members and (ii)
the list of publication records of each member, without
using paper contents or citation counts? Working with this
question, we arrived at a metric, which we call P-score.

2.1 Overview and Assumptions 

The basic idea of P-score is to associate a reputation with
publication venues based on the publication patterns of a
set of reference groups of researchers in a given area or
sub-area of knowledge. For now we assume that it is possible
to select such reference groups, even though the selection
may be controversial.

We assume that the reputation of a research group is 
strongly influenced by the reputation of its members, which 
is largely dependent on their publication records. P-score is 
based on the following assumptions: 

1. A researcher or a group member conveys reputation to 
a venue proportionally to its own reputation. 

2. The reputation of a researcher is proportional to the 
reputation of the venues in which he/she publishes. 


Once a reference group in a given area is selected, the
reputation of members in this group is transferred to the
venues. A Markov chain model can then be built from these
ideas.

2.2 Notation and Publication Counts 

Before developing the model, we introduce some notation.
Table 1 summarizes the notation and definitions used in this
work. We use $\omega$ and $j$ as indexes for research groups
and the venues where they publish, respectively. The research
groups used as reputation sources are referred to jointly as
the reference groups. Consider a chosen set $\mathcal{T}$ of
reference groups, and let $T$ be its cardinality. Let
$\mathcal{V}$ be the set of all venues $v_j$ where the groups
in $\mathcal{T}$ publish, and $V$ the total number of venues
in the set $\mathcal{V}$. Members of research group $\omega$
publish in the subset $\mathcal{V}_\omega \subseteq \mathcal{V}$
with cardinality $V_\omega = |\mathcal{V}_\omega|$.


Table 1: Notation

Symbol                  Definition
$\mathcal{T}$           set of reference groups
$T$                     cardinality of $\mathcal{T}$
$\omega$                a research group in $\mathcal{T}$
$\mathcal{V}$           set of venues where the researchers in $\mathcal{T}$ publish
$V$                     cardinality of $\mathcal{V}$
$\mathcal{V}_\omega$    set of venues where the researchers of group $\omega$ publish
$V_\omega$              cardinality of $\mathcal{V}_\omega$
$v_j$                   a venue where members of a group in $\mathcal{T}$ publish
$N(\omega, v_j)$        total number of distinct papers published by group $\omega$ in venue $v_j$
$N(v_j)$                total number of papers published in venue $v_j$
$N(\omega)$             total number of publications of group $\omega$
$D(v_j)$                number of distinct authors publishing in venue $v_j$
$\gamma_\omega$         reputation of group $\omega \in \mathcal{T}$
$\nu_j$                 reputation of venue $v_j \in \mathcal{V}$


We define a function $N$ that counts the papers published
by research groups and the papers published at venues. Let
$N(\omega, v_j)$ be the total number of distinct papers
published by research group $\omega$ in venue $v_j$, and let
$N(v_j)$ and $N(\omega)$ be the total number of papers
published in venue $v_j$ and the total number of publications
of group $\omega$ during the observation period,
respectively. That is:

$$N(\omega) = \sum_{j=1}^{V} N(\omega, v_j), \qquad N(v_j) = \sum_{\omega=1}^{T} N(\omega, v_j)$$


2.3 A Markov Model of Reputation 

From Assumption 1, the reputation of reference group
$\omega$ is defined as:

$$\gamma_\omega = \sum_{j=1}^{V} \nu_j \times \alpha_{\omega j} \tag{1}$$

where

$$\alpha_{\omega j} = \frac{N(\omega, v_j)}{N(v_j)} \tag{2}$$

is the fraction of publications of venue $v_j$ that are from
research group $\omega$ and $V$ is the number of venues.

Let $D(v_j)$ be the number of distinct authors that publish
in venue $v_j$ and $T$ the number of reference groups. From
Assumption 2, the reputation of venue $v_j$ is defined as:

$$\nu_j = \sum_{\omega=1}^{T} \gamma_\omega \times \beta_{\omega j} \tag{3}$$

where

$$\beta_{\omega j} = d \times \frac{N(\omega, v_j)}{N(\omega)} + (1 - d) \times \frac{D(v_j)}{\sum_{k=1}^{V} D(v_k)} \tag{4}$$

combines the fraction of publications of group $\omega$ that
are from venue $v_j$ and the fraction of distinct authors that
publish in $v_j$. The intuition for this formulation is that
venues that receive publications from a small set of authors
are more likely to have lower reputation; e.g., local
workshops may receive a large number of publications, but the
total number of distinct authors tends to be small. The
parameter $d$ ($0 \le d \le 1$) controls the relative
importance between the volume of publications that $v_j$
receives from a group $\omega$ and the total number of authors
publishing there.
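As a minimal sketch of Equations 2 and 4, the two sets of weights can be computed directly from a table of publication counts. The counts, the author numbers, the value of d, and the function names below are hypothetical and serve only to illustrate the arithmetic; exact fractions are used so that the normalization can be checked without rounding.

```python
from fractions import Fraction

def alpha_weights(counts):
    """alpha[w][j] = N(w, v_j) / N(v_j): share of venue j's papers written by group w."""
    n_venues = len(counts[0])
    venue_totals = [sum(row[j] for row in counts) for j in range(n_venues)]  # N(v_j)
    return [[Fraction(row[j], venue_totals[j]) for j in range(n_venues)]
            for row in counts]

def beta_weights(counts, distinct_authors, d):
    """beta[w][j] = d * N(w, v_j)/N(w) + (1 - d) * D(v_j) / sum_k D(v_k)."""
    total_authors = sum(distinct_authors)
    breadth = [Fraction(a, total_authors) for a in distinct_authors]
    return [[d * Fraction(n, sum(row)) + (1 - d) * breadth[j]
             for j, n in enumerate(row)]
            for row in counts]

# Hypothetical data: 2 reference groups (rows) x 3 venues (columns).
counts = [[4, 1, 0],
          [1, 2, 2]]
distinct_authors = [30, 12, 8]   # hypothetical D(v_j)
d = Fraction(1, 2)               # hypothetical balance between volume and breadth

alpha = alpha_weights(counts)
beta = beta_weights(counts, distinct_authors, d)

# Each column of alpha and each row of beta is a probability distribution.
print([sum(alpha[w][j] for w in range(len(counts))) for j in range(3)])
print([sum(row) for row in beta])
```

Setting d to 1 in this sketch leaves only the publication-volume term, and setting it to 0 gives every group the same row of breadth weights.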

If $d = 1$ then the reputation of the publication venues is
totally derived from the reference groups. If $d = 0$ then
the reputation of the publication venues is totally derived
from the number of distinct authors (from reference groups
or not) publishing there. We noticed that varying $d$ does
have an impact on venue weights.

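Let $P$ be a $(T + V) \times (T + V)$ square matrix such that
element $p_{mn} = 0$ if either $m, n \le T$ or $m, n > T$. In
addition, $p_{mn} = \beta_{m, n-T}$ for $m \le T$, $n > T$ and
$p_{mn} = \alpha_{n, m-T}$ for $m > T$, $n \le T$. Note that,
since $\sum_{\omega=1}^{T} \alpha_{\omega j} = 1$ for all
$1 \le j \le V$ and $\sum_{j=1}^{V} \beta_{\omega j} = 1$ for
all $1 \le \omega \le T$, $P$ defines a Markov chain. In
addition, the Markov chain is periodic and has the following
structure:

$$
P = \begin{pmatrix}
0 & \cdots & 0 & \beta_{11} & \cdots & \beta_{1V} \\
\vdots & & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & \beta_{T1} & \cdots & \beta_{TV} \\
\alpha_{11} & \cdots & \alpha_{T1} & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
\alpha_{1V} & \cdots & \alpha_{TV} & 0 & \cdots & 0
\end{pmatrix}
= \begin{pmatrix} 0 & P_{12} \\ P_{21} & 0 \end{pmatrix}
$$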
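The block structure can be sketched in a few lines, assuming states are ordered with the T groups first and the V venues after them. The weight values and the helper name `build_transition_matrix` are hypothetical; the only property relied on is that each row of the group-to-venue weights and each column of the venue-to-group weights sums to 1.

```python
def build_transition_matrix(alpha, beta):
    """Assemble the (T+V) x (T+V) matrix P = [[0, P12], [P21, 0]].

    beta[w][j]  : weight from group w to venue j (each row sums to 1).
    alpha[w][j] : weight from venue j to group w (each column sums to 1),
                  so the lower-left block P21 is the transpose of alpha.
    """
    T, V = len(beta), len(beta[0])
    size = T + V
    P = [[0.0] * size for _ in range(size)]
    for w in range(T):
        for j in range(V):
            P[w][T + j] = beta[w][j]    # upper-right block P12
            P[T + j][w] = alpha[w][j]   # lower-left block P21
    return P

# Hypothetical weights for T = 2 groups and V = 3 venues.
alpha = [[0.6, 0.5, 0.25],
         [0.4, 0.5, 0.75]]
beta = [[0.5, 0.3, 0.2],
        [0.1, 0.4, 0.5]]

P = build_transition_matrix(alpha, beta)
row_sums = [sum(row) for row in P]
for row in P:
    print(row)
print(row_sums)
```

Because the two diagonal blocks are zero, every step moves from a group to a venue or from a venue to a group, which is why the chain has period 2.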



From decomposition theory, see [2], we can obtain values for
ranking the reference groups by solving:

$$\gamma = \gamma P' \tag{5}$$

where $P' = P_{12} \times P_{21}$ is a stochastic matrix and
$\gamma = (\gamma_1, \ldots, \gamma_T)$. Note that matrix $P'$
has dimension $T \times T$ only and can be easily solved by
standard Markov chain techniques such as the GTH algorithm
[1]. Then, from Equation 6 we obtain the reputation of all
venues where the reference groups publish.

$$\nu = \gamma \times P_{12} \tag{6}$$
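The two steps in Equations 5 and 6 can be sketched as follows. The elimination routine follows the GTH scheme as it is commonly described (state reduction without subtractions, then back-substitution) and is not taken from the cited reference; the matrices and the names `gth_stationary` and `matmul` are hypothetical.

```python
def gth_stationary(P):
    """Stationary vector of an irreducible stochastic matrix via GTH elimination."""
    n = len(P)
    A = [row[:] for row in P]                 # work on a copy
    for k in range(n - 1, 0, -1):             # eliminate states n-1, ..., 1
        s = sum(A[k][:k])                     # mass flowing from k to lower states
        for i in range(k):
            A[i][k] /= s
        for i in range(k):
            for j in range(k):
                A[i][j] += A[i][k] * A[k][j]
    r = [1.0] + [0.0] * (n - 1)               # unnormalized solution
    for j in range(1, n):
        r[j] = sum(r[i] * A[i][j] for i in range(j))
    total = sum(r)
    return [x / total for x in r]

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical blocks for T = 2 groups and V = 3 venues.
P12 = [[0.5, 0.3, 0.2],      # group -> venue weights (beta)
       [0.1, 0.4, 0.5]]
P21 = [[0.6, 0.4],           # venue -> group weights (alpha, transposed)
       [0.5, 0.5],
       [0.25, 0.75]]

P_prime = matmul(P12, P21)   # T x T stochastic matrix of Equation 5
gamma = gth_stationary(P_prime)
nu = matmul([gamma], P12)[0] # venue reputations, Equation 6
print(gamma)
print(nu)
```

Since the reduced matrix is only T x T, any standard solver would do here; GTH is used in the sketch because the text names it and because it avoids subtractions.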


This vector of venue P-scores can be used to rank the authors
(or even research groups) one wants to compare. Before
continuing the development of the P-score model, it is
convenient to discuss a small example to illustrate the
notation.


2.4 Example 

Figure 1 illustrates the Markov chain associated with a
small example composed of two reference research groups
and three publication venues. In this example, faculty
members of Group 1 published a total of six papers: three
in venue $v_1$, two in venue $v_2$, and one in venue $v_3$.
Venue $v_1$ also received two papers from faculty of Group 2.
Since venue $v_1$ has a total of five papers from Groups 1
and 2, its reputation is distributed to the two groups
proportionally to the number of papers from each. The
remaining publication patterns are shown in the figure.



Figure 1: Markov chain for a small example with 2 
research groups and 3 venues. 


Consider also that we have the number of authors that
publish in each venue as additional information. In our
example, assume that venues 1, 2 and 3 receive publications
from 10, 60 and 20 distinct authors, respectively. Our
intuition is that venues with a larger number of distinct
authors are better than venues with a small number of authors
(i.e., we penalize venues that are recognized by only a few
authors). We refer to this effect as the publication breadth
of the venue. This information is modeled through the
dangling node D and the parameter $d \in [0, 1]$, which we
use to balance the relative importance of publication volume
and publication breadth in the model. If $d = 1$ then only
publication volume is considered. If $d = 0$ then only
publication breadth is considered. For illustration, consider
that $d = 1/3$ in our small example of Figure 1. Then, we can
write a stochastic transition matrix $P$ as follows:


P = 



Given $P$, we can compute the steady state probabilities
associated with each venue to obtain the vector $\nu$ of all
venues:

$$\nu = (0.189, 0.590, 0.221) \tag{7}$$

$$\frac{\nu}{\max_j \nu_j} = (0.320, 1.000, 0.375) \tag{8}$$

The values in vector $\nu$ are the venue P-scores. In our
example, venue $v_2$ has the highest P-score, followed by
$v_3$, and then by $v_1$. We remark that the normalized values
in Equation 8 give the relative importance of each venue with
respect to $v_2$.
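The arithmetic behind Equation 7 can be traced step by step. The text gives Group 1's counts (3, 2, 1), Group 2's two papers in $v_1$, the author counts (10, 60, 20) and $d = 1/3$, but Group 2's counts in $v_2$ and $v_3$ appear only in the figure. The values 4 and 2 used below are therefore an assumption, chosen because they reproduce the reported vector to three decimals.

```latex
% Assumption: N(\omega_2, v_2) = 4 and N(\omega_2, v_3) = 2 are not stated in the text.
\begin{align*}
N &= \begin{pmatrix} 3 & 2 & 1 \\ 2 & 4 & 2 \end{pmatrix}, \quad
D = (10, 60, 20), \quad d = \tfrac{1}{3} \\[4pt]
P_{12} = (\beta_{\omega j})
  &= \tfrac{1}{3} \begin{pmatrix} 3/6 & 2/6 & 1/6 \\ 2/8 & 4/8 & 2/8 \end{pmatrix}
   + \tfrac{2}{3} \begin{pmatrix} 10/90 & 60/90 & 20/90 \\ 10/90 & 60/90 & 20/90 \end{pmatrix}
   = \begin{pmatrix} 13/54 & 30/54 & 11/54 \\ 17/108 & 66/108 & 25/108 \end{pmatrix} \\[4pt]
P_{21} = (\alpha_{\omega j})^{\top}
  &= \begin{pmatrix} 3/5 & 2/5 \\ 2/6 & 4/6 \\ 1/3 & 2/3 \end{pmatrix} \\[4pt]
P' = P_{12} P_{21}
  &= \begin{pmatrix} 161/405 & 244/405 \\ 152/405 & 253/405 \end{pmatrix} \\[4pt]
\gamma = \gamma P' \;&\Rightarrow\; \gamma = \left( \tfrac{38}{99}, \tfrac{61}{99} \right) \approx (0.384, 0.616) \\[4pt]
\nu = \gamma P_{12}
  &= \tfrac{1}{10692} \, (2025,\ 6306,\ 2361) \approx (0.189, 0.590, 0.221)
\end{align*}
```

Dividing the rounded entries by the largest one, 0.590, gives (0.320, 1.000, 0.375), the second vector reported in the text.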


2.5 Comparing Authors 

Once the vector $\nu$ of venue P-scores has been computed,
we can easily compute a rank $\mathcal{R}$ for each author
$a$ in a set of authors $\mathcal{A}$ we want to compare as:

$$\mathcal{R}(a \in \mathcal{A}) = \frac{S_a}{\max_{i \in \mathcal{A}} \{S_i\}} \tag{9}$$

where $S_a$ ($a \in \mathcal{A}$) is a weighted sum of P-scores
associated with author $a$ in set $\mathcal{A}$, computed as:

$$S_a = \sum_{j=1}^{V} \nu_j \times N(a, v_j) \tag{10}$$

where $\nu_j$ is the weight (or P-score value) of venue $v_j$
according to $\nu$ and $N(a, v_j)$ is the total number of
publications from author $a$ in venue $v_j$.
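Equations 9 and 10 amount to a weighted sum followed by a normalization, as in the sketch below. The venue scores are those of Equation 7; the author names, their per-venue paper counts, and the function names are hypothetical.

```python
def author_scores(venue_scores, author_counts):
    """S_a = sum_j nu_j * N(a, v_j) for each author a (Equation 10)."""
    return {author: sum(nu * n for nu, n in zip(venue_scores, papers))
            for author, papers in author_counts.items()}

def author_ranks(scores):
    """R(a) = S_a / max_i S_i, so the top author gets rank value 1 (Equation 9)."""
    top = max(scores.values())
    return {author: s / top for author, s in scores.items()}

venue_scores = [0.189, 0.590, 0.221]   # venue P-scores from Equation 7

# Hypothetical authors with their paper counts in venues v1, v2, v3.
author_counts = {
    "ana":  [2, 1, 0],
    "bo":   [0, 2, 1],
    "chen": [5, 0, 0],
}

scores = author_scores(venue_scores, author_counts)
ranks = author_ranks(scores)
for author in sorted(ranks, key=ranks.get, reverse=True):
    print(author, round(scores[author], 3), round(ranks[author], 3))
```

In this made-up data the author with five papers in the lowest-scoring venue ends up below the author with three papers in better-scoring venues, which is the kind of distinction the venue weights are meant to capture.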


3. DISCUSSION 

We have proposed a metric to assess academic productivity,
which we call P-score, given that it is based solely on the
publication patterns of research groups. The basic idea of
P-score is to associate a reputation with publication venues
based on the publication patterns of reference groups,
composed of researchers, in a given area of knowledge.
Although the choice of reference groups can be made by using
available citation data, the P-score metric itself does not
depend on citation data. It uses only the publication records
of researchers and research groups, i.e., the papers and the
venues in which they were published. Preliminary experiments
suggest that the results correlate strongly with
citation-based metrics and yet are complementary to them,
something we are investigating further.